Ganatra, Amit
- A Survey on Web Personalisation and Recommendation Techniques
Authors
1 Dharamsinh Desai University, Nadiad, District Kheda, Gujarat, IN
2 CHARUSAT, Changa, District Anand, Gujarat, IN
Source
International Journal of Knowledge Based Computer System, Vol 2, No 2 (2014), Pagination: 13-19
Abstract
The quantity of accessible information on the web continues to grow rapidly and has exceeded human processing capabilities. The sheer volume of information makes it harder for users to discover what they need. Recommendation systems have become a valuable resource for users seeking intelligent ways to search through the enormous volume of information available to them. Web logs are important information repositories that record user activity on search results, and mining these logs can improve the performance of search engines, since a user has a specific goal when searching for information. In this paper, a survey of the different recommendation techniques is provided, with their advantages and drawbacks, and a brief comparison of different personalisation techniques is made on certain parameters.
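As a minimal illustration of one family of techniques the survey covers (not code from the paper), a user-based collaborative filtering recommender scores items a target user has not seen by the similarity-weighted ratings of other users. The users, items, and ratings below are hypothetical:

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse rating dicts."""
    common = set(u) & set(v)
    num = sum(u[i] * v[i] for i in common)
    den = math.sqrt(sum(x * x for x in u.values())) * \
          math.sqrt(sum(x * x for x in v.values()))
    return num / den if den else 0.0

def recommend(target, others, top_n=2):
    """Rank unseen items by the similarity-weighted average rating."""
    scores, weights = {}, {}
    for other in others:
        sim = cosine(target, other)
        if sim <= 0:
            continue
        for item, rating in other.items():
            if item in target:
                continue  # the target user already rated this item
            scores[item] = scores.get(item, 0.0) + sim * rating
            weights[item] = weights.get(item, 0.0) + sim
    ranked = sorted(((s / weights[i], i) for i, s in scores.items()), reverse=True)
    return [item for _, item in ranked[:top_n]]

# Toy usage: each profile is an {item: rating} dict.
alice = {"a": 5, "b": 3}
users = [{"a": 5, "b": 3, "c": 4}, {"a": 1, "c": 1, "d": 5}]
print(recommend(alice, users))
```

Real systems would normalise for per-user rating bias and use far larger neighbourhoods; memory-based filtering like this is only one of the families such surveys compare against content-based and hybrid approaches.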
Keywords
Log Mining, Personalisation, Recommendation Techniques, Web Usage Mining.

- Learning Using Heterogeneous Classifier in Data Mining
Authors
1 Chandubhai S Patel Institute of Technology Changa, Gujarat, IN
2 Chandubhai S Patel Institute of Technology, Changa, Gujarat, IN
Source
Data Mining and Knowledge Engineering, Vol 3, No 13 (2011), Pagination: 788-792
Abstract
Data Mining can be considered an analytic process designed to explore business or market data in search of consistent patterns and/or systematic relationships between variables, and then to validate the findings by applying the detected patterns to new subsets of data; it is particularly useful for prediction. The accuracy of different classifiers can be improved by combining various classifiers and aggregating their predictions. One such method is stacking, an ensemble method in which a number of base classifiers are combined through a meta-classifier that learns from their outputs. This enhances the benefits obtained by the individual classifiers. This paper reviews the different approaches proposed by various authors.
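A minimal sketch of the stacking idea (illustrative only, not the reviewed authors' code): single-feature threshold stumps act as base classifiers, and a perceptron meta-classifier learns from their outputs. All names and data are hypothetical:

```python
class Stump:
    """Base classifier: thresholds one feature at the midpoint of the class means."""
    def __init__(self, feature):
        self.f = feature
    def fit(self, X, y):
        pos = [x[self.f] for x, t in zip(X, y) if t == 1]
        neg = [x[self.f] for x, t in zip(X, y) if t == 0]
        self.thresh = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
        return self
    def predict(self, x):
        return 1 if x[self.f] >= self.thresh else 0

def train_stacking(X, y, features, epochs=50, lr=0.1):
    """Stacking: base stumps feed a perceptron meta-classifier."""
    bases = [Stump(f).fit(X, y) for f in features]
    w, b = [0.0] * len(bases), 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):
            z = [c.predict(x) for c in bases]          # meta-features
            out = 1 if sum(wi * zi for wi, zi in zip(w, z)) + b >= 0 else 0
            err = t - out                               # perceptron update
            w = [wi + lr * err * zi for wi, zi in zip(w, z)]
            b += lr * err
    def predict(x):
        z = [c.predict(x) for c in bases]
        return 1 if sum(wi * zi for wi, zi in zip(w, z)) + b >= 0 else 0
    return predict

# Toy data: class 1 has large values in both features.
X = [(1, 1), (2, 1), (8, 9), (9, 8), (1, 2), (9, 9)]
y = [0, 0, 1, 1, 0, 1]
clf = train_stacking(X, y, features=[0, 1])
print([clf(x) for x in X])
```

A production stacker would generate the meta-features with cross-validation, so the meta-classifier does not overfit the base classifiers' training error.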
Keywords
Ensemble of Classifiers, Bagging, Boosting, Stacking, Troika.

- Improved K-Means with Dimensionality Reduction Technique
Authors
1 Charotar Institute of Technology Changa, Nadiad, Gujarat, IN
Source
Data Mining and Knowledge Engineering, Vol 3, No 12 (2011), Pagination: 722-725
Abstract
Clustering is the process of finding groups of objects such that the objects in a group are similar to one another and different from the objects in other groups. K-means is a well-known partitioning-based clustering technique that attempts to find a user-specified number of clusters represented by their centroids. The K-means clustering algorithm often does not work well in high dimensions; hence, to improve efficiency, we apply PCA, a dimensionality reduction technique, to the data set and obtain a reduced data set of possibly uncorrelated variables. A challenging task for any clustering method is to determine the number of clusters beforehand. To find it, we apply the EM method, which suggests the number of clusters a user should choose by fitting a mixture of Gaussians to the data set. Finally, the experimental results show that techniques such as PCA and EM improve the efficiency of K-means clustering.
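The K-means step described above can be sketched as follows (a toy illustration on hypothetical 2-D data, without the PCA and EM stages):

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means: assign each point to its nearest centroid, then recompute."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            clusters[j].append(p)
        for j, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster goes empty
                centroids[j] = tuple(sum(xs) / len(members) for xs in zip(*members))
    return centroids, clusters

pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
cents, cls = kmeans(pts, 2)
print(sorted(len(c) for c in cls))
```

In the pipeline the abstract describes, the points would first be projected onto the top principal components, and k would come from the EM/Gaussian-mixture estimate rather than being fixed by hand.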
Keywords
Cluster, EM, K-Means, PCA.

- Scientific Understanding, Experimental Analysis and a Survey on Evolution of Classification Rule Mining Based on Ant Colony Optimization
Authors
1 Department of Computer Engineering CIT-Changa, Gujarat, IN
2 Department of Computer Engineering, Dharmsinh Desai University Nadiad, Gujarat, IN
Source
Data Mining and Knowledge Engineering, Vol 3, No 2 (2011), Pagination: 82-89
Abstract
Given the explosive rate of data deposition on the web, classification has become a complex and dynamic phenomenon, and as its complexity grows, so does the need to design and develop new data mining algorithms and techniques. Classification is the most commonly applied data mining technique: a process of finding a set of models or functions that describe and distinguish data classes so that they can be used to predict the class of new data. A classification problem is considered a supervised learning problem. The aim of the classification task is to discover a relationship between the attributes (input) and the class (output), so that the discovered knowledge can be used to predict the class of a new, unknown object. Classification of records is done based on classification rules. Ant colony optimization is a method inspired by real ants that forage for food by selecting the shortest of the multiple possible paths to it. Merging Ant Colony Optimization (ACO) with data mining thus brings a new approach to designing classification rules that helps extract information from a specialised dataset. In this paper, a survey of the Ant-Miner algorithm for classification rule extraction is presented. The Ant-Miner algorithm extracts classification rules from data as if-then patterns, similar to other traditional classification algorithms. Extraction of classification rules from data is an important data mining task. We present a detailed description of the algorithms available for classification rule mining using ant colony optimization. Variations of the ACO-based Ant-Miner algorithm are discussed, along with a comparison of the algorithms on critical parameters, such as predictive accuracy, number of rules discovered, and number of terms per rule, using different data sets. The paper thus supports the study of the various Ant-Miner algorithms, and the comparison carried out will help the data miner select an algorithm according to need, based on the specialised properties associated with it.
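As a rough, hypothetical sketch of the Ant-Miner idea (a simplification, not the surveyed algorithm itself): ants assemble a rule from attribute-value terms with probability proportional to pheromone, rule quality is sensitivity times specificity, and the best rule found reinforces the pheromone on its terms:

```python
import random

def quality(rule, data):
    """Ant-Miner rule quality: sensitivity * specificity."""
    covered = [r for r in data
               if all(r[a] == v for a, v in rule["terms"].items())]
    tp = sum(1 for r in covered if r["class"] == rule["class"])
    fp = len(covered) - tp
    fn = sum(1 for r in data if r["class"] == rule["class"]) - tp
    tn = len(data) - tp - fp - fn
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    return sens * spec

def mine_rule(data, attrs, target, ants=30, evap=0.9, seed=1):
    """Ants build candidate rules term by term, guided by pheromone;
    the best rule found so far reinforces the pheromone on its terms."""
    terms = [(a, v) for a in attrs for v in sorted({r[a] for r in data})]
    tau = {t: 1.0 for t in terms}          # uniform initial pheromone
    rng = random.Random(seed)
    best, best_q = {"terms": {}, "class": target}, 0.0
    for _ in range(ants):
        rule = {"terms": {}, "class": target}
        choices = list(terms)
        while choices:
            a, v = rng.choices(choices, weights=[tau[t] for t in choices])[0]
            rule["terms"][a] = v
            choices = [t for t in choices if t[0] != a]  # one term per attribute
            q = quality(rule, data)
            if q > best_q:
                best, best_q = {"terms": dict(rule["terms"]), "class": target}, q
        for t in tau:                       # evaporation ...
            tau[t] *= evap
        for term in best["terms"].items():  # ... then reinforcement
            tau[term] += best_q
    return best, best_q

data = [{"weather": "sunny", "wind": "low",  "class": "yes"},
        {"weather": "sunny", "wind": "high", "class": "yes"},
        {"weather": "rainy", "wind": "low",  "class": "no"},
        {"weather": "rainy", "wind": "high", "class": "no"}]
rule, q = mine_rule(data, ["weather", "wind"], "yes")
print(rule["terms"], round(q, 2))
```

The real Ant-Miner additionally weights term choice by an entropy-based heuristic, prunes each constructed rule, and removes covered cases before mining the next rule; all of that is omitted here.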
Keywords
Ant Colony Optimization (ACO), Classification, Data Mining.

- Incremental Discretization for Naïve Bayes Learning with Optimum Binning
Authors
1 Charotar University of Science and Technology, Changa, Gujarat, IN
2 Charotar University of Science and Technology, Changa, Gujarat, IN
3 Department of Computer Engineering, Dharamsinh Desai University, Nadiad, Gujarat, IN
Source
Data Mining and Knowledge Engineering, Vol 3, No 4 (2011), Pagination: 266-271
Abstract
Incremental Flexible Frequency Discretization (IFFD) is a recently proposed discretization approach for Naïve Bayes (NB). IFFD performs satisfactorily by setting the minimal interval frequency for discretized intervals to a fixed number. In this paper, we first argue that this setting cannot guarantee that the selected MinBinSize is always optimal across different datasets, so the performance of Naïve Bayes suffers in terms of classification error. We therefore propose a sequential search method for NB named Optimum Binning (OB). Experiments were conducted on 4 datasets from the UCI machine learning repository, and performance was compared among NB trained on data discretized by OB, IFFD, and PKID.
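As a minimal, hypothetical illustration of the frequency-based discretization family that IFFD and PKID belong to (not the proposed OB method), equal-frequency binning with a MinBinSize-style parameter can be sketched as:

```python
def equal_frequency_bins(values, min_bin_size):
    """Cut sorted values into intervals holding at least `min_bin_size`
    instances each, never cutting between equal values."""
    ordered = sorted(values)
    cuts, i = [], min_bin_size
    while i < len(ordered):
        if ordered[i] != ordered[i - 1]:
            cuts.append((ordered[i] + ordered[i - 1]) / 2)
            i += min_bin_size
        else:
            i += 1            # slide forward until the tied run ends
    return cuts

def discretize(value, cuts):
    """Interval index of a continuous value given the cut points."""
    return sum(1 for c in cuts if value > c)

vals = [1, 2, 3, 4, 10, 11, 12, 13]
cuts = equal_frequency_bins(vals, min_bin_size=4)
print(cuts, [discretize(v, cuts) for v in vals])
```

IFFD's contribution is updating such intervals incrementally as instances arrive; the paper's OB method searches over the minimal bin size instead of fixing it in advance.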
Keywords
Discretization, Naïve Bayes, Optimum Binning.

- Classification using Generalization Based Decision Tree Induction along with Relevance Analysis Based on Relational Database
Authors
1 Charotar Institute of Technology Changa, Gujarat, IN
2 Charotar Institute of Technology, Changa, Gujarat, IN
Source
Data Mining and Knowledge Engineering, Vol 2, No 10 (2010), Pagination: 287-293
Abstract
Classification is a process of predicting unknown values of certain attributes of interest based on the values of other attributes, and is a major challenge in data mining. A commonly used method is the decision tree. The efficiency of decision tree algorithms is well established for relatively small data sets. However, this method of classification has problems when handling larger data sets and data with continuous numerical values, and it tends to favor attributes with many distinct values when selecting the final determining attribute. In data mining applications, large training sets are common; therefore, decision tree algorithms have limitations of scalability. Also, in most data mining applications, users have little knowledge of which signature attribute should be selected for effective mining, and so depend on the capability of the algorithm. In this paper, we address two issues: selecting the right signature attribute and handling large data sets. We accomplish this by proposing a new data classification method that integrates a sequence of preprocessing steps, data cleaning, attribute-oriented induction (identifying the signature attribute), and relevance analysis, followed by induction of decision trees. This stepwise approach lets us set simple extraction rules at multiple levels of abstraction and handles large data sets and continuous numerical values in a scalable way.
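Relevance analysis of the kind described is commonly based on information gain; a small hypothetical sketch (not the paper's implementation), on made-up rows:

```python
import math

def entropy(labels):
    """Shannon entropy of a class-label list."""
    n = len(labels)
    counts = {c: labels.count(c) for c in set(labels)}
    return -sum((k / n) * math.log2(k / n) for k in counts.values())

def info_gain(rows, attr, target="class"):
    """Expected reduction in class entropy from splitting on `attr`."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for v in {r[attr] for r in rows}:
        subset = [r[target] for r in rows if r[attr] == v]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder

rows = [
    {"outlook": "sunny", "id": 1, "class": "no"},
    {"outlook": "sunny", "id": 2, "class": "no"},
    {"outlook": "rain",  "id": 3, "class": "yes"},
    {"outlook": "rain",  "id": 4, "class": "yes"},
]
print(info_gain(rows, "outlook"), info_gain(rows, "id"))
```

Note that the many-valued id attribute scores just as high as outlook here, which is exactly the multiplicity bias the abstract mentions, and why relevance analysis or measures such as gain ratio are applied before tree induction.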
Keywords
Data Mining, Classification, Data Cleaning, Decision Tree Induction, Relevance Analysis.

- An Improved Expectation Maximization based Semi-Supervised Text Classification using Naïve Bayes and Support Vector Machine
Authors
1 Department of Computer Engineering, Chandubhai S Patel Institute of Technology, Changa, Petlad, IN
2 Department of Information and Technology, Chandubhai S Patel Institute of Technology, Changa, Petlad, IN
3 U & PU Patel Department of Computer Engineering, Chandubhai S Patel Institute of Technology, Changa, Petlad, IN
Source
Artificial Intelligent Systems and Machine Learning, Vol 4, No 5 (2012), Pagination: 330-335
Abstract
With the development of the Internet and the emergence of a large number of text resources, automatic text classification has become a research hotspot. As the number of training documents increases, so does the accuracy of text classification. Traditional classifiers (supervised learning) use only labeled data for training, yet labeled instances are often difficult, expensive, or time-consuming to obtain, while unlabeled data may be relatively easy to collect. Semi-supervised learning makes use of both labeled and unlabeled data. Several researchers have given algorithms for text classification using semi-supervised learning, but improving its accuracy remains a challenge. In the iterative process of standard Expectation Maximization (EM) based semi-supervised learning, some unlabeled samples are misclassified by the current classifier because the initial labeled samples are insufficient. To overcome this limitation, an EM-based semi-supervised learning algorithm using Naïve Bayes and Support Vector Machine is proposed in this paper to improve the accuracy of text classification.
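A minimal sketch of the EM-style loop with Naïve Bayes (hard label assignments for brevity; the paper's improved method, which also brings in an SVM, is not reproduced here, and the toy documents are invented):

```python
import math
from collections import Counter

def train_nb(docs):
    """docs: list of (tokens, label). Multinomial NB with Laplace smoothing."""
    classes, vocab = {}, set()
    for tokens, label in docs:
        classes.setdefault(label, Counter()).update(tokens)
        vocab.update(tokens)
    priors = Counter(label for _, label in docs)
    def predict(tokens):
        def score(c):
            total = sum(classes[c].values())
            s = math.log(priors[c] / len(docs))
            for w in tokens:
                s += math.log((classes[c][w] + 1) / (total + len(vocab)))
            return s
        return max(classes, key=score)
    return predict

def em_self_train(labeled, unlabeled, rounds=3):
    """EM-style loop: label the unlabeled pool with the current model,
    then retrain on labeled + pseudo-labeled data."""
    model = train_nb(labeled)
    for _ in range(rounds):
        pseudo = [(toks, model(toks)) for toks in unlabeled]
        model = train_nb(labeled + pseudo)
    return model

labeled = [(["good", "great"], "pos"), (["bad", "awful"], "neg")]
unlabeled = [["good", "fun"], ["bad", "boring"]]
model = em_self_train(labeled, unlabeled)
print(model(["fun"]), model(["boring"]))
```

Standard EM would assign fractional (probabilistic) labels in the E-step rather than hard ones; the hard-assignment variant above is the simplest form of the iterate-label-retrain loop the abstract criticises.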
Keywords
Expectation Maximization (EM), Naïve Bayes (NB), Support Vector Machine (SVM), Semi-Supervised Learning (SSL).

- Spiking Back Propagation Multilayer Neural Network Design for Predicting Unpredictable Stock Market Prices with Time Series Analysis
Authors
1 Patel Department of Computer Engineering, Charotar University of Science & Technology, CHARUSAT, Gujarat, IN
Source
Artificial Intelligent Systems and Machine Learning, Vol 2, No 9 (2010), Pagination: 202-212
Abstract
Stock prediction is, so far, one of the popular topics not only for research purposes but also for commercial applications. Owing to its importance, a well-established school of concepts and techniques, including fundamental and technical analysis, has developed in recent decades. However, because these techniques or tools are based on totally different analytical approaches, they often yield contradictory results. More importantly, these analytical tools are heavily dependent on human expertise and justification in areas such as the location of reversal (or continuation) patterns, market patterns, and trend prediction. Predicting stock data with traditional time series analysis has proven to be difficult. An artificial neural network may be more suitable for the task, primarily because no assumption about a suitable mathematical model has to be made prior to forecasting. With their ability to discover patterns in nonlinear and chaotic systems, neural networks offer the ability to predict market directions more accurately than current techniques. Furthermore, a neural network can extract useful information from large sets of data, which is often required for a satisfying description of a financial time series. The focus of our study is to build a neural network for stock market prediction. We propose to study feed-forward back-propagation networks and their predictive accuracy, along with the architecture of the neural network and its parameters, such as momentum, learning rate, and the number of neurons. We compare the architectures and results of the above models, aiming to build the best model by studying various parameters of the neural network, and also study other related models to compare accuracy. In this study we used the R tool to implement the neural network, taking closing price, turnover, global indices, interest rate, and inflation as network inputs. We propose to include other indicators, such as news, currency rates, and crude oil prices, as inputs to the network. We compared stock prediction accuracy under different network parameter settings. Subsequently, an attempt is made to build and evaluate a neural network with different network parameters. Technical as well as fundamental data are used as input to the network. In benchmark comparisons, the price prediction proves to be successful.
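The core building block discussed above, a feed-forward network trained by back-propagation, can be sketched in a few lines (toy XOR data instead of market data; illustrative only, and in Python rather than the R tool the study uses):

```python
import math, random

def train_net(data, hidden=8, epochs=3000, lr=0.5, seed=0):
    """Minimal 2-input, one-hidden-layer sigmoid network trained by back-propagation."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-1, 1) for _ in range(hidden)]
    b2 = 0.0
    sig = lambda z: 1 / (1 + math.exp(-z))
    for _ in range(epochs):
        for x, t in data:
            # forward pass
            h = [sig(sum(w * xi for w, xi in zip(w1[j], x)) + b1[j])
                 for j in range(hidden)]
            y = sig(sum(w * hj for w, hj in zip(w2, h)) + b2)
            # backward pass: gradient of squared error through the sigmoids
            dy = (y - t) * y * (1 - y)
            for j in range(hidden):
                dh = dy * w2[j] * h[j] * (1 - h[j])
                w2[j] -= lr * dy * h[j]
                for i in range(2):
                    w1[j][i] -= lr * dh * x[i]
                b1[j] -= lr * dh
            b2 -= lr * dy
    def predict(x):
        h = [sig(sum(w * xi for w, xi in zip(w1[j], x)) + b1[j])
             for j in range(hidden)]
        return sig(sum(w * hj for w, hj in zip(w2, h)) + b2)
    return predict

xor = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
net = train_net(xor)
print([round(net(x)) for x, _ in xor])
```

A stock-prediction variant would feed windows of normalised prices and indicators in place of the binary inputs, and tune the momentum and learning-rate parameters the study examines.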
Keywords
Classification, Neural Network, Feature Selection, Prediction, Stock Market.

- Initial Classification through Back Propagation in a Neural Network Following Optimization through GA to Evaluate the Fitness of an Algorithm
Authors
1 Department of Computer Engineering, Charotar Institute of Technology, Charotar University of Science and Technology, Changa, Anand-388 421, IN
2 Information Technology Department, Charotar Institute of Technology, Charotar University of Science and Technology, Changa, Anand-388 421, IN